Add execution mode option to PyHDK python tests #518

kurapov-peter · 2023-06-08T12:01:31Z

The PR introduces a --device option to pytests to allow for switching between execution modes. The default behavior remains unchanged.
It also provides a quick simple fix (by disabling L0 compilation context caching with enable-gpu-code-compilation-cache=false) to the following problem:
When HDK components are handled manually and the data manager is reinitialized the compilation cache may outlive the GPU manager (since its lifetime is bound to the data manager's lifetime). The compilation context would then try to access an invalid instance L0Device with a valid device handle (the driver returns the same handle on consequent calls).

ienkovich · 2023-06-08T16:23:12Z

Is enable-gpu-code-compilation-cache=false a temporary solution?

Also, can we choose a device via pyhdk.init() in test suite setup or in another similar way? It feels wrong that users have to choose an execution device in each run call.

kurapov-peter · 2023-06-09T12:18:28Z

Is enable-gpu-code-compilation-cache=false a temporary solution?

Yep. The data manager could reset the cache when it's destroyed, but that would create a cycle dependency with executor now. Flushing the cache on executor destruction also doesn't seem right. I think the right thing to do is get rid of the device pointer in the compilation context. There's a couple questions for such a change though that I'll need to address (for example: if we only use the device id to route execution what happens if the new executor has less devices available).

Also, can we choose a device via pyhdk.init() in test suite setup or in another similar way? It feels wrong that users have to choose an execution device in each run call.

We could, but why would you? I think in the end we will have a default setting that allows any possible offloading scheme. If there's any problem with a particular query (i.e. the cost model is very wrong on the device preference), you can force it to run with a specific setting.

We also do have the cpu-only setting for pyhdk.init already, so this is not how a user would use the API. We have these options present here only to simplify testing and make it explicit.

ienkovich · 2023-06-09T18:02:50Z

Yep. The data manager could reset the cache when it's destroyed, but that would create a cycle dependency with executor now.

You can do it by adding a new GpuCodeCache interface that Executor would implement and then clear the executor's cache through this interface to avoid the cycle dep. We've used a similar trick with Executor before.

We also do have the cpu-only setting for pyhdk.init already, so this is not how a user would use the API. We have these options present here only to simplify testing and make it explicit.

I don't feel the testing is more simple if I have to add an additional arg to each test function, each sql call, and each run call. It is just a global testing option, why not set it once somewhere?

kurapov-peter · 2023-06-12T11:59:03Z

You can do it by adding a new GpuCodeCache interface that Executor would implement and then clear the executor's cache through this interface to avoid the cycle dep. We've used a similar trick with Executor before.

#520

I don't feel the testing is more simple if I have to add an additional arg to each test function, each sql call, and each run call. It is just a global testing option, why not set it once somewhere?

It's not "each test function", only those that need the option. It is also not a global testing option. It may seem so now because we currently allow CPU fallbacks. Simplicity should come with explicitness. I'd like to forbid fallbacks in tests later on so that we are sure a query runs on the device we asked for. Some of those we will want to fall back. Having a global setting seems inconvenient for that.

I agree that it would be nice to type less. I initially added the cpu-only as a "parameterize" clause for that very reason. Yet setting cpu-only globally changes data manager behavior (we don't create a gpu manager at all), so it's less flexible and doesn't cover the "have-gpu-manger-but-run-on-cpu" case. It also wouldn't allow running both cases in the same env -- you'd need a reset as we don't expose the config.

There're a couple of more-or-less straightforward ways to set the device type once of course. Is it worth it though?

kurapov-peter requested review from alexbaden and ienkovich June 8, 2023 12:01

alexbaden approved these changes Jun 8, 2023

View reviewed changes

alexbaden added the for next release label Jun 27, 2023

kurapov-peter added 5 commits June 27, 2023 19:10

Add execution config parameter to pytest

b386a27

Use the default cpu-only value

7da9888

Add option to disable gpu kernels caching

aa48b18

Add --device parameter to pyhdk tests

8ff9f3a

Remove debug logging

5c8be86

kurapov-peter force-pushed the pakurapo/pytest-l0 branch from 181d417 to 5c8be86 Compare June 27, 2023 19:13

alexbaden merged commit f421fae into main Jun 28, 2023

alexbaden deleted the pakurapo/pytest-l0 branch June 28, 2023 02:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add execution mode option to PyHDK python tests #518

Add execution mode option to PyHDK python tests #518

kurapov-peter commented Jun 8, 2023

ienkovich commented Jun 8, 2023

kurapov-peter commented Jun 9, 2023

ienkovich commented Jun 9, 2023

kurapov-peter commented Jun 12, 2023

Add execution mode option to PyHDK python tests #518

Add execution mode option to PyHDK python tests #518

Conversation

kurapov-peter commented Jun 8, 2023

ienkovich commented Jun 8, 2023

kurapov-peter commented Jun 9, 2023

ienkovich commented Jun 9, 2023

kurapov-peter commented Jun 12, 2023